The likelihood is
\[ y_i \sim \mathcal{N}(\mu, \sigma) \]
The posterior distribution has two parameters: \(\mu\) and \(\sigma\).
\[ \mathrm{P}(\mu, \sigma \mid y) = \frac{\prod_i\mathcal{N}(y_i \mid \mu, \sigma )\mathcal{N}(\mu \mid 0, 10)\mathrm{Exp}(\sigma \mid 1)}{\int\int\prod_i\mathcal{N}(y_i \mid \mu, \sigma )\mathcal{N}(\mu \mid 0, 10)\mathrm{Exp}(\sigma \mid 1) d\mu d\sigma} \]
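The posterior above can be approximated numerically. Here is a minimal sketch using grid approximation in R; the data vector `y` and the grid bounds are illustrative assumptions, not values from the text:

```r
# Toy data for illustration only
y <- c(4.2, -1.3, 2.7, 0.5, 3.1)

# Grid over the two parameters
mu_grid    <- seq(-5, 10, length.out = 200)
sigma_grid <- seq(0.1, 10, length.out = 200)
grid <- expand.grid(mu = mu_grid, sigma = sigma_grid)

# Unnormalised log posterior: log likelihood plus log priors
grid$log_post <- sapply(seq_len(nrow(grid)), function(i) {
  sum(dnorm(y, grid$mu[i], grid$sigma[i], log = TRUE)) +
    dnorm(grid$mu[i], 0, 10, log = TRUE) +
    dexp(grid$sigma[i], 1, log = TRUE)
})

# Normalise over the grid -- this plays the role of the double
# integral in the denominator
grid$post <- exp(grid$log_post - max(grid$log_post))
grid$post <- grid$post / sum(grid$post)
```

The normalisation step is why the denominator never needs to be computed analytically: on a grid it is just a sum.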
The linear model is \(\mu_i = \alpha + \beta x_i\).
Three parameters: \(\alpha\), \(\beta\) and \(\sigma\). \(\mu\) is now determined by \(\alpha\) and \(\beta\).
\[ y \sim \mathcal{N}(\mu, \sigma) \\ \mu = \alpha + \beta x \\ \alpha \sim \mathcal{N}(0, 10) \\ \beta \sim \mathcal{U}(0, 1) \\ \sigma \sim \mathrm{Exp}(1) \]
We choose \(\alpha\) to be a reasonable distribution for adults, and \(\beta\) to be a reasonable distribution of slopes centered on zero, because we have no reason to believe right now that height is related to year.
\[ h_i \sim \mathcal{N}(\mu_i, \sigma) \\ \mu_i = \alpha + \beta(x_i - \bar{x}) \\ \alpha \sim \mathcal{N}(170, 20) \\ \beta \sim \mathcal{N}(0, 1) \\ \sigma \sim \mathrm{Exp}(1) \]
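As a sketch, this model can be fit with `rethinking::quap`; the data frame `d` and its column names (`height`, `year`) are assumptions here:

```r
library(rethinking)

xbar <- mean(d$year)  # centre the predictor, as in the model definition

m_height <- quap(
  alist(
    height ~ dnorm(mu, sigma),
    mu <- a + b * (year - xbar),
    a ~ dnorm(170, 20),
    b ~ dnorm(0, 1),
    sigma ~ dexp(1)
  ),
  data = d
)

precis(m_height)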
If every student got taller each year, then the mean height would get taller each year, which means that \(\beta\) would have some positive slope. So we would adjust \(\beta\) to be centered on a number greater than zero.
If the variance is less than 64, then the standard deviation is less than 8. Let’s take a look at the current prior distribution for \(\sigma\).
OK, so this seems to discount any \(\sigma > 4\). We might want to adjust our rate a bit:
That seems a bit better, so I would probably change the rate in the exponential distribution to \(0.5\).
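The effect of halving the rate can be checked directly, since the exponential tail probability is available in closed form:

```r
# Overlay the two candidate priors for sigma
curve(dexp(x, rate = 1), from = 0, to = 12,
      xlab = "sigma", ylab = "density")        # current prior, Exp(1)
curve(dexp(x, rate = 0.5), add = TRUE, lty = 2) # candidate, Exp(0.5)

# Prior probability that sigma exceeds 4 under each rate
pexp(4, rate = 1,   lower.tail = FALSE)  # exp(-4) ~ 0.018
pexp(4, rate = 0.5, lower.tail = FALSE)  # exp(-2) ~ 0.135
```

With rate \(0.5\), meaningful prior mass extends out towards \(\sigma = 8\), which matches the variance constraint above.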
## a b sigma
## a 7.306610e-02 -3.849183e-08 6.156994e-05
## b -3.849183e-08 1.754885e-03 -2.284378e-05
## sigma 6.156994e-05 -2.284378e-05 3.653997e-02
## a b sigma
## a 3.596097143 -0.0783189936 0.0092343716
## b -0.078318994 0.0017410879 -0.0002015824
## sigma 0.009234372 -0.0002015824 0.0365748427
Centering around the mean has almost eliminated the covariances. Let’s compare posterior models:
The posteriors are pretty much identical whether weight-centered or not.
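As a sketch of the comparison, one can fit the same model with and without centring and inspect the covariance matrices; the data frame `d` with columns `height` and `weight`, and the prior choices, are assumptions here:

```r
library(rethinking)

xbar <- mean(d$weight)  # centring constant

m_raw <- quap(
  alist(
    height ~ dnorm(mu, sigma),
    mu <- a + b * weight,           # uncentred predictor
    a ~ dnorm(170, 20),
    b ~ dnorm(0, 1),
    sigma ~ dexp(1)
  ), data = d)

m_ctr <- quap(
  alist(
    height ~ dnorm(mu, sigma),
    mu <- a + b * (weight - xbar),  # centred predictor
    a ~ dnorm(170, 20),
    b ~ dnorm(0, 1),
    sigma ~ dexp(1)
  ), data = d)

round(vcov(m_raw), 4)  # noticeable off-diagonal covariances
round(vcov(m_ctr), 4)  # off-diagonals close to zero
```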
It appears that the combination of knots and weight width influences the general wiggliness of the fit lines.
## weight predicted_height lwr upr
## 1 46.95 156.0548 147.9660 163.8110
## 2 43.72 153.6140 146.0450 161.4261
## 3 64.78 178.9107 171.0117 187.1157
## 4 32.59 143.3635 136.0379 150.9784
## 5 54.63 162.9876 155.8075 170.6068
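A table like the one above can be produced by simulating from the posterior predictive distribution. This sketch assumes a fitted `quap` model `m` with predictor `weight`; the interval level is also an assumption:

```r
library(rethinking)  # provides sim() and PI()

new_data <- data.frame(weight = c(46.95, 43.72, 64.78, 32.59, 54.63))

# Each column of `sims` holds posterior-predictive draws of height
# for one of the new weights
sims <- sim(m, data = new_data)

intervals <- apply(sims, 2, PI, prob = 0.89)

data.frame(
  weight           = new_data$weight,
  predicted_height = apply(sims, 2, mean),
  lwr              = intervals[1, ],
  upr              = intervals[2, ]
)
```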
Let’s get the data first:
Now let’s define a linear model. We choose our priors on the basis that we have no idea about a child’s height:
Let’s get an estimate for b:
## mean sd 5.5% 94.5%
## a 189.116450 2.13815496 185.699266 192.533635
## b 2.685287 0.06809552 2.576457 2.794117
## sigma 8.443203 0.43148257 7.753610 9.132795
For every 10 units of weight, the model predicts an additional height of around 27cm. Let’s chart this model.
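One way to chart it, as a sketch assuming the fitted model `m` and data frame `d` from above:

```r
library(rethinking)  # extract.samples(), rangi2 colour

post <- extract.samples(m)  # draws from the posterior

plot(height ~ weight, data = d, col = rangi2)
curve(mean(post$a) + mean(post$b) * x, add = TRUE)  # posterior-mean line
```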
Weight seems to have a linear relationship with height in the mid-range of approx 15-35 weight units, but the model underestimates height for this range. This is because the model overestimates height at the extreme ends. It looks like a parabolic model might be a better fit here. I would try a quadratic term next.
This is a remarkably good fit. So each multiplication of weight by a factor of \(e\) results in an increase in height of approx 49cm. This would explain the shape of the fit. Below we estimate the model and plot it.
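A sketch of the log-weight model; the priors here are illustrative, and the data frame `d` with columns `height` and `weight` is assumed:

```r
library(rethinking)

m_log <- quap(
  alist(
    height ~ dnorm(mu, sigma),
    mu <- a + b * log(weight),  # linear in log(weight)
    a ~ dnorm(100, 50),         # illustrative weak priors
    b ~ dnorm(0, 50),
    sigma ~ dexp(1)
  ), data = d)

precis(m_log)  # b is the height change per e-fold increase in weight
```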
Let’s simulate our prior for various values of our parameters:
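The general pattern of a prior predictive simulation looks like this; the parameter names, prior choices, and axis ranges below are illustrative placeholders, not the ones used in the analysis:

```r
set.seed(1)
n  <- 50
a  <- rnorm(n, 0, 10)     # illustrative intercept prior
b1 <- rlnorm(n, 0, 0.5)   # illustrative slope prior
b2 <- rnorm(n, 0, 1)      # illustrative curvature prior

plot(NULL, xlim = c(-2, 2), ylim = c(-30, 30), xlab = "x", ylab = "y")
for (i in 1:n)
  curve(a[i] + b1[i] * x + b2[i] * x^2, add = TRUE,
        col = rgb(0, 0, 0, 0.3))  # one curve per prior draw
```

Plotting many draws at once makes it easy to see whether the priors imply curves that are physically plausible.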
Many of these priors are way outside the boundaries of reality. We learned from the previous question that when the relationship is linear on the log scale, the intercept is negative, so let’s try a negative value for \(\alpha\):
That hasn’t helped much; the spread is still very wide, so we will reduce the spread of \(\alpha\) and of the other coefficients:
This is better, but we need to make sure that the curves all move upwards, so we change the distribution for \(\beta_1\) to a plain normal distribution to restrict it more. After experimenting with the coefficient distributions, we arrive at the following, which seems a reasonable prior:
First we try a simple linear model:
Let’s compute the MAP estimate, the 97% interval for \(\mu\), and the 97% prediction interval.
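As a sketch, both intervals come from the usual `link`/`sim` pattern; the fitted model `m` and the predictor name `temp` are assumptions here:

```r
library(rethinking)  # link(), sim(), PI()

temp_seq <- seq(min(d$temp), max(d$temp), length.out = 50)
pred <- list(temp = temp_seq)

mu   <- link(m, data = pred)  # draws of the linear predictor mu
ysim <- sim(m, data = pred)   # posterior-predictive draws of the outcome

mu_mean <- apply(mu, 2, mean)                 # MAP-style point estimate
mu_PI   <- apply(mu, 2, PI, prob = 0.97)      # 97% interval for mu
y_PI    <- apply(ysim, 2, PI, prob = 0.97)    # 97% prediction interval
```

The interval for \(\mu\) captures uncertainty in the regression line only, while the prediction interval adds the observation noise \(\sigma\), so it is always wider.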
And graph the results:
This suggests that higher temperature influences earlier blooming. Now we try a quadratic model:
This suggests no major relationship. Let’s try a cubic:
This suggests that there is a range that does not affect first blossom day, but that higher temperatures outside this range relate to earlier blossoming.
Finally, let’s try a spline model based on cubic basis functions:
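A sketch of such a model with B-spline basis functions; the data frame `d` with columns `temp` and `doy`, the number of knots, and the priors are all assumptions here:

```r
library(rethinking)
library(splines)  # bs() builds the cubic basis functions

# Place knots at quantiles of the predictor (count is illustrative)
knots <- quantile(d$temp, probs = seq(0, 1, length.out = 5))
B <- bs(d$temp, knots = knots[-c(1, 5)], degree = 3, intercept = TRUE)

m_spline <- quap(
  alist(
    doy ~ dnorm(mu, sigma),
    mu <- a + B %*% w,     # weighted sum of basis functions
    a ~ dnorm(100, 10),
    w ~ dnorm(0, 10),
    sigma ~ dexp(1)
  ),
  data = list(doy = d$doy, B = B),
  start = list(w = rep(0, ncol(B)))
)
```

More knots, or a wider prior on the weights `w`, allow wigglier fit lines, which is the trade-off discussed above.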
Let’s look at a similar relationship between year and March temperature:
Finally, let’s pull in a previous analysis so we can line these up nicely in a way that suggests possible temperature causality: